Contributor

@inv-jishnu inv-jishnu commented Nov 14, 2025

Description

This PR updates the startExport method to ensure proper and reliable shutdown of the ExecutorService used during data export. I have created this PR based on @komamitsu san's comment #3146 (comment) from another PR.
Previously, the executor was shut down only after the export loop completed successfully. If an exception occurred before reaching that point, the executor would never be shut down, potentially causing thread leaks and keeping the JVM alive longer than intended.

The updated implementation moves the executor shutdown and termination logic into a finally block, ensuring that the executor is shut down in all scenarios.

This fix improves resource safety, prevents thread leaks, and aligns the code with best practices for managing thread pools.
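
The shutdown-in-finally pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual startExport code: the class name, the `runExport` method, the simulated failure, and the 10-second timeout are all assumptions made for the example (the PR itself waits with Long.MAX_VALUE).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ExecutorShutdownSketch {

  // Hypothetical stand-in for the export loop. It submits three tasks and can
  // optionally throw in the middle of the loop to simulate a failure.
  static boolean runExport(boolean failDuringSubmit, List<String> out) {
    ExecutorService executorService = Executors.newFixedThreadPool(2);
    try {
      for (int i = 0; i < 3; i++) {
        final int chunk = i;
        executorService.submit(() -> {
          out.add("chunk-" + chunk);
        });
        if (failDuringSubmit && i == 1) {
          throw new IllegalStateException("simulated failure in export loop");
        }
      }
    } catch (IllegalStateException e) {
      out.add("error");
    } finally {
      // Runs on both the success path and the failure path.
      executorService.shutdown();
      try {
        if (!executorService.awaitTermination(10, TimeUnit.SECONDS)) {
          executorService.shutdownNow();
        }
      } catch (InterruptedException e) {
        // Restore the interrupt flag so callers can still observe it.
        Thread.currentThread().interrupt();
        executorService.shutdownNow();
      }
    }
    return executorService.isTerminated();
  }

  public static void main(String[] args) {
    List<String> out = new CopyOnWriteArrayList<>();
    boolean terminated = runExport(true, out);
    System.out.println("terminated=" + terminated + ", output=" + out);
  }
}
```

With the shutdown in the finally block, the pool is released even when the loop throws, which is the thread-leak scenario the description refers to.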

Related issues and/or PRs

NA

Changes made

  • Moved executorService.shutdown() and awaitTermination(...) logic into a finally block.
  • Added handling for potential InterruptedException during awaitTermination inside the finally block.
  • Also moved processFooter into the finally block.

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • I have considered whether similar issues could occur in other products, components, or modules if this PR is for bug fixes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

NA

Release notes

Improve ExecutorService lifecycle handling by moving shutdown to finally block

@gemini-code-assist
Contributor

Summary of Changes

Hello @inv-jishnu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the startExport method's robustness by ensuring the ExecutorService is reliably shut down. Previously, an ExecutorService might not terminate if an exception occurred before its shutdown logic was reached, leading to resource leaks. By relocating the shutdown and termination logic into a finally block, the system now guarantees proper resource cleanup in all execution paths, thereby preventing thread leaks and aligning with best practices for managing thread pools.

Highlights

  • ExecutorService Shutdown Reliability: The executorService.shutdown() and awaitTermination() calls have been moved into a finally block within the startExport method. This ensures that the ExecutorService is always properly shut down, even if exceptions occur during the export process.
  • InterruptedException Handling: Explicit InterruptedException handling has been added around awaitTermination within the finally block. This correctly restores the interrupt status of the current thread and logs the interruption, preventing potential thread leaks and improving robustness.

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly identifies the need to ensure the ExecutorService is always shut down and moves the shutdown logic to a finally block. This is a good improvement for resource safety. However, the change in its current form introduces a critical race condition by calling processFooter before awaitTermination. This could lead to corrupted output, especially for JSON files. I have left comments with suggestions to correct the order of operations, ensuring the footer is written only after all export tasks have completed.

            logger.error("Timeout occurred while waiting for tasks to complete");
            // TODO: handle this
          }
          processFooter(exportOptions, tableMetadata, bufferedWriter);
Contributor

critical

This call to processFooter introduces a race condition. It's now executed before the tasks in the ExecutorService are guaranteed to be complete. For formats like JSON, this can lead to a corrupted output file (e.g., a closing ] written before all objects). This line should be removed. The call should be moved into the finally block after awaitTermination succeeds, as I've suggested in another comment.

Contributor Author

I have made changes based on feedback in e84bc49.

Comment on lines 113 to 122
        try {
          if (executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS)) {
            logger.info("All tasks completed");
          } else {
            logger.error("Timeout occurred while waiting for tasks to complete");
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          logger.error("Interrupted while waiting for executor termination", e);
        }
Contributor

critical

As mentioned in my other comment, processFooter should be called here, after awaitTermination has successfully completed. This ensures all data has been written before the footer is added. The suggestion below moves the call here and also adds the necessary handling for IOException, which processFooter can throw.

        try {
          if (executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS)) {
            logger.info("All tasks completed");
            processFooter(exportOptions, tableMetadata, bufferedWriter);
          } else {
            logger.error("Timeout occurred while waiting for tasks to complete");
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          logger.error("Interrupted while waiting for executor termination", e);
        } catch (IOException e) {
          logger.error("Error writing footer: ", e);
        }

Contributor Author

I have made changes based on feedback in e84bc49.

Contributor

@gemini-code-assist @inv-jishnu I think there's still a potential race condition.

Even after awaitTermination() times out and Thread.currentThread().interrupt() is called, busy threads might still be running. If those threads are still executing when processFooter() is called, the output could be corrupted.

How about guarding processFooter() with executorService.isTerminated() to ensure all threads have actually completed?
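
The isTerminated() guard proposed here can be sketched roughly as below. This is an illustration under assumptions, not the PR's code: the class and method names are hypothetical, a StringBuffer stands in for the buffered writer, appending "]" stands in for processFooter(), and a short 500 ms timeout is used so that the "task still busy" case can actually be reached.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FooterGuardSketch {

  // Returns the produced output. The closing "]" plays the role of the footer.
  static String export(boolean taskFinishesInTime) {
    StringBuffer out = new StringBuffer("[");
    CountDownLatch neverReleased = new CountDownLatch(1);
    ExecutorService executorService = Executors.newSingleThreadExecutor();
    try {
      executorService.submit(() -> {
        try {
          if (!taskFinishesInTime) {
            neverReleased.await(); // simulates a task that is still busy
          }
          out.append("{\"id\":1}");
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    } finally {
      executorService.shutdown();
      try {
        executorService.awaitTermination(500, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      if (executorService.isTerminated()) {
        // All tasks have finished, so writing the footer cannot race with them.
        out.append("]");
      } else {
        // Tasks are still running: skip the footer and cancel the remaining work.
        executorService.shutdownNow();
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(export(true));
    System.out.println(export(false));
  }
}
```

The guard covers both the timeout path and the interrupted path, because isTerminated() only reports true once every submitted task has completed after shutdown.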

Contributor

You've raised a very important and valid point, @komamitsu. There is indeed a potential race condition in the current implementation regarding processFooter().

My initial suggestion (in comment #1) aimed to address this by placing processFooter() inside the if (executorService.awaitTermination(...)) block. This would ensure that the footer is only processed if all tasks completed successfully within the given timeout. If awaitTermination() returns true, it implies that executorService.isTerminated() would also be true at that point, meaning all tasks have indeed completed.

However, in the current code, processFooter() has been moved to be called unconditionally after the awaitTermination block (at lines 124-127 in the full patch). This reintroduces the race condition you've identified: if awaitTermination() times out (returns false) or an InterruptedException occurs, processFooter() will still be called while tasks might still be running or in an inconsistent state, potentially leading to data corruption.

To resolve this, processFooter() should be moved back into the if (executorService.awaitTermination(...)) block, similar to my original suggestion. This ensures it's only executed when all tasks are confirmed to have completed. The IOException handling for processFooter should also be included within that conditional block.

Since the processFooter call is now outside the lines highlighted by this comment (lines 112-121), I cannot provide a direct code suggestion here. However, the necessary change is to move the try-catch block for processFooter (currently at lines 124-127) to be immediately after logger.info("All tasks completed"); (line 115), and ensure its IOException is handled there.

Contributor Author

@inv-jishnu inv-jishnu Nov 17, 2025

@komamitsu san,

I have moved the processFooter section inside the if condition for executorService.awaitTermination(...), as suggested, in 27b5484.
Thank you.

Contributor

@inv-jishnu Let me ask this just in case. Is it okay to always call processFooter() even when something fails (e.g., fetchDataChunk() throws an exception)?
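
For reference, one possible way to make the footer conditional on success is sketched below. It is not taken from the PR and does not settle the question: the class and method names, the `failed` flag, the simulated failing chunk, and the use of Future.get() to detect task failures are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class FooterOnSuccessSketch {

  // Returns the produced output. The closing "]" plays the role of the footer.
  static String export(boolean failOneChunk) {
    StringBuffer out = new StringBuffer("[");
    ExecutorService executorService = Executors.newSingleThreadExecutor();
    List<Future<?>> futures = new ArrayList<>();
    boolean failed = false;
    try {
      for (int i = 0; i < 2; i++) {
        final int chunk = i;
        futures.add(executorService.submit(() -> {
          if (failOneChunk && chunk == 1) {
            // Hypothetical stand-in for a chunk fetch that throws.
            throw new IllegalStateException("simulated fetch failure");
          }
          out.append(chunk);
        }));
      }
      for (Future<?> future : futures) {
        future.get(); // rethrows a task failure as ExecutionException
      }
    } catch (ExecutionException e) {
      failed = true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      failed = true;
    } finally {
      executorService.shutdown();
      try {
        boolean terminated = executorService.awaitTermination(5, TimeUnit.SECONDS);
        // Write the footer only if every task finished and none of them failed.
        if (terminated && !failed) {
          out.append("]");
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(export(false));
    System.out.println(export(true));
  }
}
```

Whether a partially written export should still receive a footer depends on how consumers treat incomplete files, so this sketch only shows the mechanics of skipping it.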

@inv-jishnu inv-jishnu marked this pull request as draft November 14, 2025 06:12
@inv-jishnu inv-jishnu marked this pull request as ready for review November 14, 2025 06:21